A simple infinite topic mixture for rich graphs and relational data

Authors

  • Janne Sinkkonen
  • Juuso Parkkinen
  • Samuel Kaski
Abstract

We propose a simple component or “topic” model for relational data, that is, for heterogeneous collections of co-occurrences between categorical variables. Graphs are a special case, as collections of dyadic co-occurrences (edges) over a set of vertices. The model is especially suitable for finding global components from collections of massively heterogeneous data, where encoding all the relations to a more sophisticated model becomes cumbersome, as well as for quick-and-dirty modeling of graphs enriched with, e.g., link properties or nodal attributes. The model is here estimated with collapsed Gibbs sampling, which allows sparse data structures and good memory efficiency for large data sets. Other inference methods should be straightforward to implement. We demonstrate the model with various medium-sized data sets (scientific citation data, MovieLens ratings, protein interactions), with brief comparisons to a full relational model and other approaches.

Similar resources

Social Network Mining with Nonparametric Relational Models

Statistical relational learning (SRL) provides effective techniques to analyze social network data with rich collections of objects and complex networks. Infinite hidden relational models (IHRMs) introduce nonparametric mixture models into relational learning and have been successful in many relational applications. In this paper we explore the modeling and analysis of complex social networks w...


Metadata Enrichment for Automatic Data Entry Based on Relational Data Models

The idea of automatic generation of data entry forms based on data relational models is a common and known idea that has been discussed day by day more than before according to the popularity of agile methods in software development accompanying development of programming tools. One of the requirements of the automation methods, whether in commercial products or the relevant research projects, ...


Simple Equations for Predicting Entropy of Ammonia-Water Mixture

This work presents a set of three simple and explicit equations as a function of temperature, pressure, and mass fraction for calculation of the entropy of the ammonia-water mixture in saturated and super heated conditions. They are intended for use in the optimization and second law efficiency of absorption processes. The equations are constructed by the least square method for curve fitting u...


Semiprojectivity for Certain Purely Infinite C*-algebras

It is proved that classifiable simple separable nuclear purely infinite C*-algebras having finitely generated K-theory and torsion-free K1 are semiprojective. This is accomplished by exhibiting these algebras as C*-algebras of infinite directed graphs.


Learning infinite mixture of networks

We propose a Bayesian method to discover entangled directed graphs from scratch data. This method can be applied to gene regulation study and other applications. We show that an EM approach can recover a fixed number of components. Using a Dirichlet process mixture model, it is also possible to discover infinite mixture of causality relationships.


Publication date: 2008